Project Final Report

Team Members

Project Description

Project Goal & Social Problem

We chose to study earthquakes, a natural disaster that can be devastating and unpredictable, especially in regions that are not accustomed to them.

The aim of this project is to understand whether earthquakes around the world are related to one another. To that end, we looked for historical, regional, and triggering links between earthquakes.

Project data & access to data

Choosing suitable datasets was important for making the earthquake data meaningful. We chose the United States Geological Survey (USGS) for worldwide data and the Boğaziçi University Kandilli Observatory and Earthquake Research Institute (KOERI) for data specific to Turkey. We used earthquake data from both sources for the years 2016-2020.

The datasets were obtained through the web interfaces and APIs provided by the USGS and KOERI. To increase accuracy and avoid confusion, the analysis uses only events with a magnitude of 2.5 or greater.
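
As an illustration of how such a download can be scripted, the sketch below builds a query URL for the USGS FDSN event web service with a minimum magnitude of 2.5. The helper name `build_usgs_url` and the date range are hypothetical, and this is not necessarily how the files in this report were obtained. As far as we know, the service caps the number of events returned per request, which is one reason to download the data in several pieces.

```r
# Hypothetical helper: builds a CSV query URL for the USGS FDSN event service.
build_usgs_url <- function(start, end, min_mag = 2.5) {
  base <- "https://earthquake.usgs.gov/fdsnws/event/1/query"
  params <- c(format = "csv",
              starttime = start,
              endtime = end,
              minmagnitude = min_mag)
  # Join the parameters as name=value pairs separated by "&"
  paste0(base, "?", paste(names(params), params, sep = "=", collapse = "&"))
}

# Example date range (an assumption, chosen only for illustration)
url_2016_h1 <- build_usgs_url("2016-01-01", "2016-07-01")
print(url_2016_h1)

# Downloading requires network access, so it is left commented out:
# data2016_1 <- read.csv(url_2016_h1)
```

Building the URL from a named vector keeps each query parameter visible and easy to change.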

Actions taken

Note: This section includes all the code blocks we used, so that the report is self-explanatory. The oral presentation file contains only the insights.

Within the scope of the project, we first cleaned the data we imported from the USGS and KOERI sites. Both datasets include events that are not confirmed earthquakes, so we excluded them to keep them from affecting the analysis. Restricting the import to magnitudes of 2.5 or greater made this step much easier. Next, we cleaned out roughly 2,000 rows of missing data. We then reclassified the variables by data type and reviewed summary statistics for the numeric variables to get a first impression. Finally, we decided on the visualizations that we thought would be useful and produced them.

Install libraries

#Prerequisites
install.packages("maps")
install.packages("ggpubr")
install.packages("kableExtra")

Loading libraries

library(tidyverse)#for data manipulation
library(lubridate)#for formatting date and time
library(kableExtra)#for printing tables
library(readxl)#for reading excel file
library(ggplot2) #for graphs
library(maps) #for world map
library(ggpubr)#for density function
library(leaflet)#for creating map widgets
library(viridis)#for color palettes

Loading datasets

The dataset for earthquakes that occurred in Turkey was obtained from the Kandilli Observatory and Earthquake Research Institute (KOERI) Database Search. The data was retrieved in txt format and then pasted into an Excel (xlsx) file.

The USGS (United States Geological Survey) Search Catalog is our source for the worldwide earthquake data.

#KOERI dataset
turkey_earthquake <- read_excel("data/boun.xlsx")

#USGS datasets
data2016_1 <- read.csv("data/query 2016-1.csv")
data2016_2 <- read.csv("data/query 2016-2.csv")
data2017_1 <- read.csv("data/query 2017-1.csv")
data2017_2 <- read.csv("data/query 2017-2.csv")
data2018_1 <- read.csv("data/query 2018-1.csv")
data2018_2 <- read.csv("data/query 2018-2.csv")
data2018_3 <- read.csv("data/query 2018-3.csv")
data2019_1 <- read.csv("data/query 2019-1.csv")
data2019_2 <- read.csv("data/query 2019-2.csv")
data2020_1 <- read.csv("data/query 2020-1.csv")
data2020_2 <- read.csv("data/query 2020-2.csv")

Introduction to Datasets

KOERI Dataset

The KOERI (Kandilli Observatory and Earthquake Research Institute) Earthquake Catalog is our source for detailed earthquake data for Turkey.

turkey_tidyquake <- turkey_earthquake %>% 
                      select(No,
                             Event_ID = `Deprem Kodu`,
                             Date = `Olus tarihi`,
                             Origin_Time = `Olus zamani`,
                             Latitude = Enlem,
                             Longitude = Boylam,
                             Depth_km = `Der(km)`,
                             Mag = xM,
                             Type = Tip) %>% 
                      filter(Type == "Ke")
head(turkey_tidyquake)
## # A tibble: 6 x 9
##      No Event_ID Date      Origin_Time         Latitude Longitude Depth_km   Mag
##   <dbl>    <dbl> <chr>     <dttm>                 <dbl>     <dbl>    <dbl> <dbl>
## 1     1  2.02e13 2020.07.~ 1899-12-31 16:28:07     38.5      27.5     16.5   2.5
## 2     2  2.02e13 2020.07.~ 1899-12-31 12:55:40     38.5      27.5      8.6   3  
## 3     3  2.02e13 2020.07.~ 1899-12-31 12:45:17     36.7      28.2     67.1   2.5
## 4     4  2.02e13 2020.07.~ 1899-12-31 12:25:11     38.5      27.5      3.8   2.6
## 5     5  2.02e13 2020.07.~ 1899-12-31 10:44:28     38.5      27.5      8.6   3  
## 6     6  2.02e13 2020.07.~ 1899-12-31 09:59:45     38.5      27.5     13.3   3.8
## # ... with 1 more variable: Type <chr>

The parameters and their explanations in this data are given below:

Param Name    Description
No            Event sequence number
Event ID      Unique ID for the event [YYYYMMDDHHMMSS (YearMonthDayHourMinuteSecond)]
Date          Date of the event in the format YYYY.MM.DD (Year.Month.Day)
Origin Time   Origin time of the event (UTC) in the format HH:MM:SS.MS
Latitude      Latitude in decimal degrees
Longitude     Longitude in decimal degrees
Depth(km)     Depth of the event in kilometers
Mag           Magnitude of the event
Type          Earthquake (Ke) or Suspected Explosion (Sm)
Location      Nearest settlement

USGS Dataset

The USGS (United States Geological Survey) Search Catalog is our source for the worldwide earthquake data. We downloaded the data as several separate files per year, so the files have to be merged.

rawData <- rbind(data2016_1, 
                data2016_2, 
                data2017_1, 
                data2017_2, 
                data2018_1, 
                data2018_2, 
                data2018_3, 
                data2019_1, 
                data2019_2, 
                data2020_1, 
                data2020_2)
head(rawData)
##                       time latitude longitude  depth mag magType nst   gap
## 1 2016-06-30T23:35:00.100Z  17.9638  -68.5780 65.000 3.8      Md  18 194.4
## 2 2016-06-30T23:04:59.230Z  36.7601  137.9208 10.300 4.5      mb  NA  42.0
## 3 2016-06-30T22:51:19.680Z  16.7209  146.3311 74.260 4.4      mb  NA  77.0
## 4 2016-06-30T22:25:03.700Z  36.4771  -98.7412  7.694 3.5     mwr  NA  51.0
## 5 2016-06-30T22:20:10.570Z  17.3387  147.4923 54.440 4.3      mb  NA 171.0
## 6 2016-06-30T21:47:49.290Z  -2.6333  139.0413 51.200 4.1      mb  NA  87.0
##       dmin  rms net         id                  updated
## 1 1.118403 0.56  pr pr16182026 2016-08-31T03:08:24.040Z
## 2 0.312000 0.56  us us10005yyt 2016-08-31T03:08:24.040Z
## 3 1.561000 0.81  us us1000619a 2016-08-31T03:08:24.040Z
## 4       NA 0.25  us us10005yy6 2016-08-31T03:08:24.040Z
## 5 2.650000 0.82  us us1000619h 2016-08-31T03:08:24.040Z
## 6 6.790000 0.72  us us1000619b 2016-08-31T03:08:24.040Z
##                                            place       type horizontalError
## 1    45 km S of Boca de Yuma, Dominican Republic earthquake             2.7
## 2                       8 km NE of Hakuba, Japan earthquake             3.3
## 3 177 km NNE of Saipan, Northern Mariana Islands earthquake             8.8
## 4                  17 km SE of Waynoka, Oklahoma earthquake             1.2
## 5  299 km NE of Saipan, Northern Mariana Islands earthquake            10.1
## 6                 176 km W of Abepura, Indonesia earthquake             6.6
##   depthError magError magNst   status locationSource magSource
## 1        7.1    0.000     15 reviewed             pr        pr
## 2        4.1    0.045    147 reviewed             us        us
## 3        6.8    0.096     31 reviewed             us        us
## 4        1.7       NA     12 reviewed            tul       slm
## 5        9.5    0.153     12 reviewed             us        us
## 6        8.7    0.130     16 reviewed             us        us
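
The merge above names every file explicitly. A minimal alternative sketch, assuming all files share the same columns and a common naming pattern, lists the files and stacks them in one step. The temporary directory and its two tiny files are hypothetical stand-ins for the real files under data/.

```r
# Create two small stand-in CSV files in a temporary directory
# (hypothetical content; the real files live under data/).
tmp_dir <- tempfile("quake_demo")
dir.create(tmp_dir)
write.csv(data.frame(time = "2016-06-30T23:35:00.100Z", mag = 3.8),
          file.path(tmp_dir, "query 2016-1.csv"), row.names = FALSE)
write.csv(data.frame(time = "2016-12-31T22:10:00.000Z", mag = 4.5),
          file.path(tmp_dir, "query 2016-2.csv"), row.names = FALSE)

# List every file matching the naming pattern, read each one, and stack them.
files <- list.files(tmp_dir, pattern = "^query .*\\.csv$", full.names = TRUE)
merged <- do.call(rbind, lapply(files, read.csv))
print(nrow(merged))
```

With this approach, adding another download to the folder does not require changing the code.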

Removing unneeded columns and saving the result as all_data

data <- select(rawData, -c("nst","id","updated"))

write.csv(data,"data/all_data.csv")

Data Structures

KOERI
str(turkey_tidyquake)
## tibble [14,517 x 9] (S3: tbl_df/tbl/data.frame)
##  $ No         : num [1:14517] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Event_ID   : num [1:14517] 2.02e+13 2.02e+13 2.02e+13 2.02e+13 2.02e+13 ...
##  $ Date       : chr [1:14517] "2020.07.01" "2020.07.01" "2020.07.01" "2020.07.01" ...
##  $ Origin_Time: POSIXct[1:14517], format: "1899-12-31 16:28:07" "1899-12-31 12:55:40" ...
##  $ Latitude   : num [1:14517] 38.5 38.5 36.7 38.5 38.5 ...
##  $ Longitude  : num [1:14517] 27.5 27.5 28.2 27.5 27.5 ...
##  $ Depth_km   : num [1:14517] 16.5 8.6 67.1 3.8 8.6 13.3 5 15.4 12.8 10.6 ...
##  $ Mag        : num [1:14517] 2.5 3 2.5 2.6 3 3.8 2.6 2.5 2.8 2.5 ...
##  $ Type       : chr [1:14517] "Ke" "Ke" "Ke" "Ke" ...

As seen above, Date is stored as a character vector. Converting it to the Date class makes later data manipulation easier.

turkey_tidyquake$Date <- as.Date(turkey_tidyquake$Date, format = "%Y.%m.%d")
str(turkey_tidyquake$Date)
##  Date[1:14517], format: "2020-07-01" "2020-07-01" "2020-07-01" "2020-07-01" "2020-07-01" ...
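
The Origin_Time column shows every event on 1899-12-31, which appears to be a placeholder date that comes from storing a time-only value in Excel. The sketch below, which is an assumption and not part of the original analysis, combines the real date with the time of day into a single UTC timestamp. The values mirror the first row of the KOERI output, and the variable names are hypothetical.

```r
# Stand-in values mirroring the first row of the KOERI output.
date_chr    <- "2020.07.01"
origin_time <- as.POSIXct("1899-12-31 16:28:07", tz = "UTC")

# Convert the character date to the Date class.
event_date <- as.Date(date_chr, format = "%Y.%m.%d")

# Keep only the time of day from origin_time and attach it to the real date.
event_time <- as.POSIXct(
  paste(format(event_date), format(origin_time, "%H:%M:%S")),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)
print(event_time)
```

A single timestamp column like this would be needed for any comparison by hour of the day.
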
#Reading all data from one source
data <- read.csv("data/all_data.csv")
str(data)
## 'data.frame':    152012 obs. of  20 variables:
##  $ X              : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ time           : chr  "2016-06-30T23:35:00.100Z" "2016-06-30T23:04:59.230Z" "2016-06-30T22:51:19.680Z" "2016-06-30T22:25:03.700Z" ...
##  $ latitude       : num  18 36.8 16.7 36.5 17.3 ...
##  $ longitude      : num  -68.6 137.9 146.3 -98.7 147.5 ...
##  $ depth          : num  65 10.3 74.26 7.69 54.44 ...
##  $ mag            : num  3.8 4.5 4.4 3.5 4.3 4.1 4.3 4.5 2.6 4.9 ...
##  $ magType        : chr  "Md" "mb" "mb" "mwr" ...
##  $ gap            : num  194 42 77 51 171 ...
##  $ dmin           : num  1.118 0.312 1.561 NA 2.65 ...
##  $ rms            : num  0.56 0.56 0.81 0.25 0.82 0.72 1.27 1.02 0.58 1.22 ...
##  $ net            : chr  "pr" "us" "us" "us" ...
##  $ place          : chr  "45 km S of Boca de Yuma, Dominican Republic" "8 km NE of Hakuba, Japan" "177 km NNE of Saipan, Northern Mariana Islands" "17 km SE of Waynoka, Oklahoma" ...
##  $ type           : chr  "earthquake" "earthquake" "earthquake" "earthquake" ...
##  $ horizontalError: num  2.7 3.3 8.8 1.2 10.1 6.6 5.7 9.6 NA 14.4 ...
##  $ depthError     : num  7.1 4.1 6.8 1.7 9.5 8.7 5.7 1.9 0.3 1.9 ...
##  $ magError       : num  0 0.045 0.096 NA 0.153 0.13 0.201 0.059 NA 0.032 ...
##  $ magNst         : int  15 147 31 12 12 16 7 85 NA 301 ...
##  $ status         : chr  "reviewed" "reviewed" "reviewed" "reviewed" ...
##  $ locationSource : chr  "pr" "us" "us" "tul" ...
##  $ magSource      : chr  "pr" "us" "us" "slm" ...

The parameters and their explanations for this data are given below:

Param Name        Description
time              Time when the event occurred, in UTC. In the CSV files used here, times are ISO 8601 strings (e.g. 2016-06-30T23:35:00.100Z)
latitude          Decimal degrees latitude. Negative values for southern latitudes
longitude         Decimal degrees longitude. Negative values for western longitudes
depth             Depth of the event in kilometers
mag               The magnitude of the event
magType           The method or algorithm used to calculate the preferred magnitude for the event
nst               The total number of seismic stations used to determine the earthquake location
gap               The largest azimuthal gap between azimuthally adjacent stations (in degrees)
dmin              Horizontal distance from the epicenter to the nearest station (in degrees)
rms               The root-mean-square (RMS) travel time residual, in seconds, using all weights
net               The ID of a data contributor
id                A unique identifier for the event
updated           Time when the event was most recently updated
place             Textual description of the named geographic region near the event
type              Type of seismic event (e.g. earthquake, quarry blast)
horizontalError   Uncertainty of the reported location of the event in kilometers
depthError        Uncertainty of the reported depth of the event in kilometers
magError          Uncertainty of the reported magnitude of the event
magNst            The total number of seismic stations used to calculate the magnitude for this earthquake
status            Indicates whether the event has been reviewed by a human
locationSource    The network that originally authored the reported location of this event
magSource         The network that originally authored the reported magnitude for this event

Checking the earthquake types

The data retrieved from the USGS contains several kinds of seismic events, not only earthquakes, so we keep only the rows whose type is earthquake. First, we check which values the type column contains.

unique(data$type)
##  [1] "earthquake"                 "mining explosion"          
##  [3] "other event"                "experimental explosion"    
##  [5] "explosion"                  "mine collapse"             
##  [7] "rock burst"                 "quarry blast"              
##  [9] "nuclear explosion"          "ice quake"                 
## [11] "landslide"                  "sonic boom"                
## [13] "collapse"                   "volcanic eruption"         
## [15] "induced or triggered event" "Ice Quake"

We are only interested in earthquakes, so we remove the other event types from the dataset.

This removes about 2,000 rows of unnecessary data.

data <- filter(data, type=="earthquake")
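
Before filtering, it can help to see how many rows each event type accounts for. The sketch below uses a small hypothetical `type` vector rather than the real column, and lowercases the labels so that variants such as "ice quake" and "Ice Quake" are counted together.

```r
# Hypothetical stand-in for the `type` column of the USGS data.
type <- c("earthquake", "earthquake", "quarry blast", "ice quake",
          "Ice Quake", "explosion", "earthquake")

# Count events per type, ignoring differences in letter case.
type_counts <- sort(table(tolower(type)), decreasing = TRUE)
print(type_counts)

# Number of rows that an earthquake-only filter would drop.
n_dropped <- sum(tolower(type) != "earthquake")
print(n_dropped)
```

Comparing `n_dropped` with the row counts before and after the filter is a quick check that the filter removed only what was intended.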

Let’s classify earthquakes according to their magnitudes in order to use them in graphs.

#KOERI
turkey_tidyquake <- turkey_tidyquake %>% 
                      mutate(magClass = cut(Mag, breaks=c(2.4,4,5,6,7,9),
                                            labels=c("2.5-4", "4-5", "5-6", "6-7", "7-9")))
#USGS
data <- mutate(data, magClass = cut(mag, breaks = c(2.4, 4, 5, 6, 7, 9), labels = c("2.5-4", "4-5", "5-6", "6-7", "7-9")))
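
To make the class boundaries explicit, the following sketch applies the same breaks and labels to a few hypothetical magnitudes. By default `cut()` builds intervals that are open on the left and closed on the right, so a magnitude of exactly 4.0 lands in the "2.5-4" class and exactly 5.0 lands in "4-5".

```r
# Hypothetical magnitudes chosen to sit on and around the class boundaries.
mags <- c(2.5, 4.0, 4.1, 5.0, 8.2)

# Same breaks and labels as in the classification above.
mag_class <- cut(mags,
                 breaks = c(2.4, 4, 5, 6, 7, 9),
                 labels = c("2.5-4", "4-5", "5-6", "6-7", "7-9"))
print(data.frame(mag = mags, magClass = mag_class))
```

Passing `right = FALSE` would instead close the intervals on the left, which would move boundary values such as 4.0 into the next class up.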

Let’s format the time column and split it into year, month, and date.

#KOERI
turkey_tidyquake <- turkey_tidyquake %>% 
                      mutate(Year = year(Date),
                             Month = month(Date))

#USGS
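# Parse as UTC and store as POSIXct (strptime alone returns POSIXlt, which is awkward in data frames)
data$time <- as.POSIXct(strptime(data$time, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC"))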
data$year <- format(data$time, format="%Y")
data$month <- format(data$time, format="%m")
data$date <- format(data$time, format="%Y%m%d")
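
The USGS timestamps end in "Z", which marks them as UTC. The sketch below parses two hypothetical timestamps explicitly as UTC and extracts the year and month. It also prints the same instants in Istanbul local time, to show that the calendar day (and here even the year) depends on the time zone used.

```r
# Two example timestamps in the same format as the USGS time column.
ts_chr <- c("2016-06-30T23:35:00.100Z", "2016-12-31T23:30:00.000Z")

# Parse as UTC; %OS accepts fractional seconds and the trailing Z is matched literally.
ts <- as.POSIXct(strptime(ts_chr, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC"))

# Year and month in UTC
year_utc  <- format(ts, "%Y")
month_utc <- format(ts, "%m")
print(data.frame(year = year_utc, month = month_utc))

# The same instants displayed in Istanbul local time (UTC+3)
print(format(ts, "%Y-%m-%d %H:%M", tz = "Europe/Istanbul"))
```

Fixing the time zone at parse time keeps the yearly and monthly counts independent of the machine the script runs on.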

Quick summary statistics for the numeric variables

sapply(data[,names(which(sapply(data, class) == "numeric"))],summary)
## $latitude
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -82.8837   0.0914  19.4058  20.2736  42.3552  87.3860 
## 
## $longitude
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -180.00 -155.26  -94.22  -49.60   74.00  180.00 
## 
## $depth
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -3.60    9.00   11.91   55.31   46.62  679.12 
## 
## $mag
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.500   2.800   3.600   3.661   4.400   8.200 
## 
## $gap
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##     7.0    66.0   112.0   129.7   186.0   359.0   14300 
## 
## $dmin
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   0.148   0.961   2.183   2.586 127.420   18523 
## 
## $rms
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  0.0000  0.2800  0.5900  0.5932  0.8400 46.2400       5 
## 
## $horizontalError
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   1.300   5.700   5.819   8.800  99.000   14206 
## 
## $depthError
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's 
##    0.000    0.800    2.000    4.737    7.200 2329.800        7 
## 
## $magError
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   0.078   0.129   0.249   0.200   5.670   22369
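
The NA's entries in the summaries above show that several fields are incomplete, and the minimum depth is negative. A minimal sketch for counting both, using a hypothetical miniature data frame rather than the real data:

```r
# Hypothetical miniature of the USGS data frame.
quakes <- data.frame(
  mag      = c(3.8, 4.5, 4.4, 3.5),
  depth    = c(65.0, 10.3, -1.2, 7.7),
  dmin     = c(1.118, 0.312, 1.561, NA),
  magError = c(0, 0.045, NA, NA)
)

# Number of missing values in each column.
na_counts <- colSums(is.na(quakes))
print(na_counts)

# Number of rows with a negative depth.
n_negative_depth <- sum(quakes$depth < 0)
print(n_negative_depth)
```

Counts like these help decide whether to drop incomplete rows or simply leave the affected columns out of an analysis.
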
data16 <- data %>% 
           filter(year == 2016)
data17 <- data %>% 
           filter(year == 2017)
data18 <- data %>% 
           filter(year == 2018)
data19 <- data %>% 
           filter(year == 2019)
data20 <- data %>% 
           filter(year == 2020)
factpal <- colorNumeric(plasma(5,direction = -1), data$mag) 
factpal16 <- colorNumeric(plasma(5,direction = -1), data16$mag)
factpal17 <- colorNumeric(plasma(5,direction = -1), data17$mag)
factpal18 <- colorNumeric(plasma(5,direction = -1), data18$mag)
factpal19 <- colorNumeric(plasma(5,direction = -1), data19$mag)
factpal20 <- colorNumeric(plasma(5,direction = -1), data20$mag)
mapm <- leaflet(map_data("world")) %>% 
          addTiles() %>% 
          addCircles(~data16$longitude, ~data16$latitude, stroke = F, radius = 5, group = "2016", color = ~factpal(data16$mag)) %>% 
          addCircles(~data17$longitude, ~data17$latitude, stroke = F, radius = 5, group = "2017", color = ~factpal(data17$mag)) %>% 
          addCircles(~data18$longitude, ~data18$latitude, stroke = F, radius = 5, group = "2018", color = ~factpal(data18$mag)) %>% 
          addCircles(~data19$longitude, ~data19$latitude, stroke = F, radius = 5, group = "2019", color = ~factpal(data19$mag)) %>% 
          addCircles(~data20$longitude, ~data20$latitude, stroke = F, radius = 5, group = "2020", color = ~factpal(data20$mag)) %>% 
          addLegend(position = "bottomright", pal = factpal, values = ~data$mag, title ="Magnitudes") %>% 
          addLayersControl(overlayGroups = c("2016", "2017", "2018", "2019", "2020"),
                           options = layersControlOptions(collapsed = FALSE))
          
          
mapm

As seen above, the earthquakes that occurred between 2016 and 2020 are plotted on the map. The fault line running through the Atlantic Ocean can be observed on the map.

Distribution of earthquakes in Turkey according to their magnitude
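
The per-year subsets used for the map layers are created with five separate filter calls. A compact alternative sketch, using base R `split()` on a hypothetical miniature data frame, produces one subset per year in a named list:

```r
# Hypothetical miniature data frame with a character year column.
quakes <- data.frame(
  year = c("2016", "2016", "2017", "2018", "2018", "2018"),
  mag  = c(3.8, 4.5, 4.4, 3.5, 2.6, 5.1)
)

# One data frame per year, stored in a list named by year.
by_year <- split(quakes, quakes$year)

# Number of events in each yearly subset.
counts_by_year <- sapply(by_year, nrow)
print(counts_by_year)
```

A list such as `by_year` can then be looped over to add one map layer per year, instead of repeating nearly identical lines.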

tr_data16 <- turkey_tidyquake %>% 
           filter(Year == 2016)
tr_data17 <- turkey_tidyquake %>% 
           filter(Year == 2017)
tr_data18 <- turkey_tidyquake %>% 
           filter(Year == 2018)
tr_data19 <- turkey_tidyquake %>% 
           filter(Year == 2019)
tr_data20 <- turkey_tidyquake %>% 
           filter(Year == 2020)
tr_factpal <- colorNumeric(plasma(5,direction = -1), turkey_tidyquake$Mag) 
tr_factpal16 <- colorNumeric(plasma(5,direction = -1), tr_data16$Mag)
tr_factpal17 <- colorNumeric(plasma(5,direction = -1), tr_data17$Mag)
tr_factpal18 <- colorNumeric(plasma(5,direction = -1), tr_data18$Mag)
tr_factpal19 <- colorNumeric(plasma(5,direction = -1), tr_data19$Mag)
tr_factpal20 <- colorNumeric(plasma(5,direction = -1), tr_data20$Mag)
tr_map <- leaflet(map_data("world")) %>% 
          addTiles() %>% 
          addCircles(~tr_data16$Longitude, ~tr_data16$Latitude, stroke = TRUE, radius = 5, group = "2016", color = ~tr_factpal(tr_data16$Mag)) %>% 
          addCircles(~tr_data17$Longitude, ~tr_data17$Latitude, stroke = TRUE, radius = 5, group = "2017", color = ~tr_factpal(tr_data17$Mag)) %>% 
          addCircles(~tr_data18$Longitude, ~tr_data18$Latitude, stroke = TRUE, radius = 5, group = "2018", color = ~tr_factpal(tr_data18$Mag)) %>% 
          addCircles(~tr_data19$Longitude, ~tr_data19$Latitude, stroke = TRUE, radius = 5, group = "2019", color = ~tr_factpal(tr_data19$Mag)) %>% 
          addCircles(~tr_data20$Longitude, ~tr_data20$Latitude, stroke = TRUE, radius = 5, group = "2020", color = ~tr_factpal(tr_data20$Mag)) %>% 
          addLegend(position = "bottomright", pal = tr_factpal, values = ~turkey_tidyquake$Mag, title ="Magnitudes") %>% 
          addLayersControl(overlayGroups = c("2016", "2017", "2018", "2019", "2020"),
                           options = layersControlOptions(collapsed = FALSE))
          
          
tr_map

As seen above, earthquakes occur frequently from the northwest to the southwest of Turkey. They are also common along the North Anatolian Fault.

Pie Chart

counts <- data %>% 
            group_by(year) %>% 
            summarise(n=n())

tr_counts <- turkey_tidyquake %>% 
              group_by(Year) %>% 
              summarise(n=n())

labels <- c("World", "Turkey")
counts16 <- c(counts$n[1], tr_counts$n[1])
counts17 <- c(counts$n[2], tr_counts$n[2])
counts18 <- c(counts$n[3], tr_counts$n[3])
counts19 <- c(counts$n[4], tr_counts$n[4])
counts20 <- c(counts$n[5], tr_counts$n[5])

percentage16<- round(100*counts16/sum(counts16), 1)
percentage17<- round(100*counts17/sum(counts17), 1)
percentage18<- round(100*counts18/sum(counts18), 1)
percentage19<- round(100*counts19/sum(counts19), 1)
percentage20<- round(100*counts20/sum(counts20), 1)
pal1=c("#A6DBA0","#B2182B")
par(mfrow=c(2,3))
pie(counts16, col=pal1, xlab="2016", labels = percentage16)
pie(counts17, col=pal1, xlab="2017", labels = percentage17)
mtext(side = 3, text = "Earthquake Percentages by Years")
pie(counts18, col=pal1, xlab="2018", labels = percentage18)
pie(counts19, col=pal1, xlab="2019", labels = percentage19)
pie(counts20, col=pal1, xlab="2020", labels = percentage20)
plot.new()
legend("bottomright",legend=c("World","Turkey"), fill = pal1)

par(mfrow=c(1,1))
mean_world <- counts %>% 
                summarise(count_avg=mean(n))

mean_tr <- tr_counts %>%
            summarise(count_avg=mean(n))

counts_mean <- c(as.numeric(mean_world), as.numeric(mean_tr))
percentage_mean <- round(100*counts_mean/sum(counts_mean), 1)
pie(counts_mean, labels = percentage_mean, col = pal1, main = "Mean Earthquake Percentages between 2016-2020")
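
The percentage labels in the pie charts all come from the same calculation. The sketch below wraps it in a small helper and builds labels that include the category name. The helper name `share_pct` and the counts are hypothetical and do not come from the datasets.

```r
# Hypothetical helper: percentage share of each count, rounded to one decimal.
share_pct <- function(counts) round(100 * counts / sum(counts), 1)

# Hypothetical counts, for illustration only.
example_counts <- c(World = 30000, Turkey = 3000)

pct <- share_pct(example_counts)
pie_labels <- paste0(names(pct), ": ", pct, "%")
print(pie_labels)
```

Note that a world catalog may itself contain events located in Turkey, so the two slices are not necessarily disjoint.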

Distribution of the Number of Earthquakes by Year

As seen below, more than 20 thousand earthquakes occurred each year from 2016 to 2020. In 2018, nearly twice as many earthquakes occurred as in 2017, which is the highest count in these five years.

year<-data %>% group_by(year) %>% tally()
p <- ggplot(year) + geom_bar(aes(x=year, y=n, fill = as.factor(n)), stat="identity")+
      scale_fill_brewer(palette = "Set1")+
      theme_minimal()+
      theme(legend.position = "none")
p <- p +
      ggtitle("Distribution of Earthquake Counts in The World by Years") +
      xlab("Years") + ylab("Counts")
p

In Turkey, more than one thousand earthquakes occurred each year from 2016 to 2020. In 2017, more than 5,000 earthquakes occurred in Turkey, which is the peak for these five years.

year_turkey <- turkey_tidyquake %>%
                group_by(Year) %>% 
                tally()
tr_yearthquake <- ggplot(year_turkey) + geom_bar(aes(x = Year, y = n, fill = as.factor(n)),
                                                 stat = "identity")+
                  scale_fill_brewer(palette = "Set1")+
                  theme_minimal()+
                  theme(legend.position = "none")+
                  ylab("Counts")
tr_yearthquake <- tr_yearthquake + ggtitle("Distribution of Earthquake Counts in Turkey by Years")
tr_yearthquake

Distribution of the Number of Earthquakes by Month

More than 10 thousand earthquakes were observed worldwide in each calendar month when the years 2016 to 2020 are combined. The count increases in the summer months, which might suggest a relationship between temperature and earthquakes, although this pattern alone does not establish one.

library(RColorBrewer)
month<-data %>% group_by(month) %>% tally()
world_monthquake <- ggplot(month) + geom_bar(aes(x=month, y=n, fill= as.factor(n)), stat="identity") +
     scale_fill_manual(values = colorRampPalette(brewer.pal(9,"Set1"))(12))+
     theme_minimal()+
     theme(legend.position = "none")
world_monthquake <- world_monthquake + ggtitle("Distribution of Earthquake Counts in The World by Months") + xlab("Months") + ylab("Counts")
world_monthquake
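
One way to check whether the monthly differences noted above are larger than chance alone would produce is a chi-square goodness-of-fit test against an even spread over the year, weighted by month length. This is a sketch under stated assumptions and was not part of the original analysis. The monthly counts are hypothetical and do not come from the datasets.

```r
# Hypothetical monthly counts (January to December), for illustration only.
monthly_counts <- c(1020, 930, 1010, 980, 1040, 1100,
                    1150, 1130, 1000, 990, 960, 1010)

# Average days per month (February as 28.25 to allow for leap years).
days_in_month <- c(31, 28.25, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

# Expected share of events per month if earthquakes were spread evenly in time.
expected_share <- days_in_month / sum(days_in_month)

# Goodness-of-fit test of the observed counts against the expected shares.
test_result <- chisq.test(monthly_counts, p = expected_share)
print(test_result$p.value)
```

A small p-value would only indicate that the counts are uneven across months. Aftershock sequences cluster events in time, so the counts are not independent, and an uneven distribution would not by itself point to temperature as the cause.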

In Turkey, the number of earthquakes also increases in the summer, although most earthquakes occurred in January and February.

tr_monthquake <- turkey_tidyquake %>% 
                  group_by(Month) %>% 
                  tally()
tr_month_plot <- ggplot(tr_monthquake) + geom_bar(aes(x=Month, y=n, fill= as.factor(n)),
                                                  stat = "identity") +
                  scale_fill_manual(values = colorRampPalette(brewer.pal(9,"Set1"))(12))+
                  scale_x_continuous(breaks = c(1:12))+
                  theme_minimal()+
                  theme(legend.position = "none")
tr_month_plot <- tr_month_plot + 
                  ggtitle("Distribution of Earthquake Counts in Turkey by Months")
tr_month_plot

Distribution of Earthquakes by Year and Magnitude Class

year <- data %>% group_by(year, magClass) %>% tally()
year %>%
  ggplot(aes(year, n)) +
  geom_point(size = 1, col = "red") +
  facet_wrap(~magClass, ncol = 2, scales = "free") +
  ggtitle("Number of Earthquakes by Magnitude Class and Year") +
  xlab("Year") + ylab("Number of Cases") +
  theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5)) +
  theme(axis.title = element_text(face = "bold", size = 12))

year <- turkey_tidyquake %>% group_by(Year, magClass) %>% tally()
year %>%
  ggplot(aes(Year, n)) +
  geom_point(size = 1, col = "red") +
  facet_wrap(~magClass, ncol = 2, scales = "free") +
  ggtitle("Number of Earthquakes by Magnitude Class and Year") +
  xlab("Year") + ylab("Number of Cases") +
  theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5)) +
  theme(axis.title = element_text(face = "bold", size = 12))

Let’s examine the distribution of earthquakes on Earth by months and magnitude classes.

month <- data %>% group_by(month, magClass) %>% tally()
month %>%
  ggplot(aes(month, n)) +
  geom_point(size = 1, col = "red") +
  facet_wrap(~magClass, ncol = 2, scales = "free") +
  ggtitle("Number of Earthquakes by Magnitude Class and Month") +
  xlab("Months") + ylab("Number of Cases") +
  theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5)) +
  theme(axis.title = element_text(face = "bold", size = 12))

Density Plots

ggdensity(data$mag, 
          main = "Density Plot of Magnitude in The World",
          xlab = "Magnitude")

ggdensity(turkey_tidyquake$Mag,
          main = "Density plot of magnitude in Turkey",
          xlab = "Magnitude")

Results and Discussion

Our research led to the following findings. We learned the average earthquake magnitudes in the world. On the world map, we saw that earthquakes are concentrated along coastlines. Compared to the other years, the number of earthquakes in 2018 almost doubled. We observed an increase in the number of earthquakes in the summer months, especially in the 2.5-4 magnitude range.

Thanks to our project, we obtained separate insights from the earthquake data of the world and of Turkey. We checked whether earthquakes in Turkey are linked to those in the rest of the world. We compared the world averages with those of Turkey, which is known as an earthquake zone. In this process, we determined that we could make comparisons on the basis of magnitude, time of day, season, and sea/terrestrial location, and we focused our research on these factors accordingly.

Conclusion

In summary, we imported and cleaned the data, restructured it by data type, and applied basic visualization techniques to analyze earthquake data for the world and for Turkey between 2016 and 2020. The data came from the USGS and KOERI, which we identified as reliable data sources. This made it possible to see visually whether the earthquakes that took place in the world and in Turkey are similar. It also let us examine whether our country, which we refer to as an earthquake zone, really has above-average earthquake activity compared to other countries in the world.

You can also access our project’s GitHub page here: Statistics of earthquake hazards in Turkey and comparison with the world
